AITopics | attention stage

Pop-out vs. Glue: A Study on the pre-attentive and focused attention stages in Visual Search tasks

Beukelman, Hendrik, Rodrigues, Wilder C.

arXiv.org Artificial Intelligence, Dec-14-2024

Success in visual search tasks depends on factors like awareness, cognitive abilities, and the nature of the search itself. Some studies have explored the complexities of visual search, focusing on asymmetry, where locating target A among distractors B is easier than finding B among A. Our research specifically examines the asymmetry between finding an oblique line among straight lines versus a straight line among oblique lines. Anne Treisman's study (Treisman & Gelade, 1980) [3] found that certain features, like colour, are more easily detected than others, such as orientation. Further, Treisman & Gormican (1988) [4] showed that identifying a vertical target among oblique distractors took longer than identifying an oblique target among vertical distractors, which supports the idea that a basic feature enhances detection. We aim to replicate these findings with the following research question: Does searching for an oblique target among vertical distractors result in search asymmetry, and vice versa? We anticipate a 'pop-out' effect when participants search for an oblique target among vertical distractors, suggesting a parallel search, as opposed to a serial search pattern in the reverse condition. Consistent with Treisman & Gormican's findings [4], we predict faster identification of oblique targets, aligning with the 'pop-out' effect, while vertical targets will require focused attention (the 'glue' phase), particularly as distractor numbers increase.
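The parallel versus serial distinction in this abstract is commonly quantified as a search slope: the increase in reaction time per additional display item. The following is a minimal sketch of that calculation, not the authors' analysis. The function names, the 10 ms/item cutoff (a rule of thumb, assumed here), and the reaction times are all hypothetical and serve only to illustrate the idea.

```python
def search_slope(set_sizes, mean_rts_ms):
    """Least-squares slope of mean reaction time (ms) against set size (items)."""
    n = len(set_sizes)
    if n != len(mean_rts_ms) or n < 2:
        raise ValueError("need at least two matching (set size, RT) pairs")
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    if sxx == 0:
        raise ValueError("set sizes must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(set_sizes, mean_rts_ms))
    return sxy / sxx


def classify_search(slope_ms_per_item, threshold=10.0):
    """Label a slope using an assumed rule-of-thumb cutoff (ms per item)."""
    if slope_ms_per_item < threshold:
        return "parallel (pop-out)"
    return "serial (focused attention)"


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not data from the paper.
    set_sizes = [4, 8, 16]
    oblique_among_vertical = [500.0, 502.0, 506.0]
    vertical_among_oblique = [520.0, 640.0, 880.0]
    for label, rts in [("oblique target", oblique_among_vertical),
                       ("vertical target", vertical_among_oblique)]:
        slope = search_slope(set_sizes, rts)
        print(label, round(slope, 2), "ms/item:", classify_search(slope))
```

A near-flat slope is what a pop-out account would predict for the oblique target, while a slope that grows with the number of distractors is the signature of item-by-item search.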

artificial intelligence, information management, participant, (18 more...)

arXiv:2412.12198

Country:

Europe > Netherlands > Utrecht > Utrecht (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre:

Research Report > Experimental Study (0.92)
Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.35)

Technology:

Information Technology > Information Management > Search (0.73)
Information Technology > Artificial Intelligence > Cognitive Science (0.49)

AttentionHand: Text-driven Controllable Hand Image Generation for 3D Hand Reconstruction in the Wild

Park, Junho, Kong, Kyeongbo, Kang, Suk-Ju

arXiv.org Artificial Intelligence, Jul-25-2024

Recently, a significant amount of research has been conducted on 3D hand reconstruction for use in various forms of human-computer interaction. However, 3D hand reconstruction in the wild is challenging due to an extreme lack of in-the-wild 3D hand datasets. In particular, when hands are in complex poses such as interacting hands, problems like appearance similarity, self-handed occlusion, and depth ambiguity make it more difficult. To overcome these issues, we propose AttentionHand, a novel method for text-driven controllable hand image generation. Since AttentionHand can generate numerous and varied in-the-wild hand images well-aligned with 3D hand labels, we can acquire a new 3D hand dataset and relieve the domain gap between indoor and outdoor scenes. Our method needs four easy-to-use modalities (i.e., an RGB image, a hand mesh image from a 3D label, a bounding box, and a text prompt). These modalities are embedded into the latent space in the encoding phase. Then, in the text attention stage, hand-related tokens from the given text prompt are attended to highlight hand-related regions of the latent embedding. After the highlighted embedding is fed to the visual attention stage, hand-related regions in the embedding are attended by conditioning on global and local hand mesh images with the diffusion-based pipeline. In the decoding phase, the final feature is decoded into new hand images, which are well-aligned with the given hand mesh image and text prompt. As a result, AttentionHand achieved state-of-the-art performance among text-to-hand image generation models, and the performance of 3D hand mesh reconstruction was improved by additionally training with hand images generated by AttentionHand.

attentionhand, hand image, mesh image, (15 more...)

arXiv:2407.18034

Genre: Research Report > Promising Solution (0.34)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Power Law Graph Transformer for Machine Translation and Representation Learning

Gokden, Burc

arXiv.org Artificial Intelligence, Jun-27-2021

We present the Power Law Graph Transformer, a transformer model with well-defined deductive and inductive tasks for prediction and representation learning. The deductive task learns the dataset-level (global) and instance-level (local) graph structures in terms of learnable power law distribution parameters. The inductive task outputs the prediction probabilities using the deductive task output, similar to a transductive model. We trained our model with Turkish-English and Portuguese-English datasets from TED talk transcripts for machine translation and compared the model performance and characteristics to a transformer model with scaled dot-product attention trained under the same experimental setup. We report BLEU scores of 17.79 and 28.33 on the Turkish-English and Portuguese-English translation tasks with our model, respectively. We also show how a duality between a quantization set and an N-dimensional manifold representation can be leveraged to transform between local and global deductive-inductive outputs using successive application of linear and non-linear transformations end-to-end.
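For readers unfamiliar with the baseline named in this abstract, the sketch below shows standard scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, in plain Python. It illustrates only the comparison model's attention step, not the Power Law Graph Transformer itself. The function names and the list-of-lists layout are assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of query, key, and value vectors.

    Each query is scored against every key by a dot product divided by
    sqrt(d_k); the softmax of those scores weights the value vectors.
    """
    d_k = len(keys[0])
    scale = math.sqrt(d_k)
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of value vectors, one output component at a time.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights


if __name__ == "__main__":
    # Toy vectors chosen for illustration only.
    keys = [[1.0, 0.0], [0.0, 1.0]]
    values = [[1.0, 2.0], [3.0, 4.0]]
    out, w = scaled_dot_product_attention([[1.0, 0.0]], keys, values)
    print("weights:", w[0])
    print("output:", out[0])
```

Dividing by sqrt(d_k) keeps the scores from growing with the vector dimension, which would otherwise push the softmax toward a near one-hot distribution.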

attention stage, graph transformer model, transformer model, (13 more...)

arXiv:2107.02039

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Oregon (0.04)
(8 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
